Associating Relevant Photos to Georeferenced Textual Documents through Rank Aggregation
Abstract
The automatic association of illustrative photos to paragraphs of text is a challenging cross-media retrieval problem with many practical applications. In this paper we propose novel methods to associate photos to textual documents. The proposed methods recognize and disambiguate location names in the texts and use them to query Flickr for candidate photos. The best photos are selected based on their popularity, their geographic proximity, their temporal cohesion, and the similarity between the photos' textual descriptions and the text of the document. We specifically tested different rank aggregation approaches to select the most relevant photos. A method that uses the CombMNZ algorithm to combine textual similarity, geographic proximity and temporal cohesion obtained the best results.
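As a rough illustration of the rank aggregation step, the sketch below applies the standard CombMNZ rule: each candidate's normalized scores are summed across rankers, and the sum is multiplied by the number of rankers that gave the candidate a non-zero score. It is not the authors' implementation. The min-max normalization, the function names, the photo identifiers and the score values are all assumptions made for the example.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one ranker's scores to [0, 1]; a constant list maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 1.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def comb_mnz(score_lists):
    """Fuse several {item: score} dicts with CombMNZ.

    Fused score = (sum of normalized scores) * (number of lists in which
    the item has a non-zero normalized score).
    """
    total = defaultdict(float)
    hits = defaultdict(int)
    for scores in score_lists:
        for item, s in min_max_normalize(scores).items():
            total[item] += s
            if s > 0:
                hits[item] += 1
    fused = {item: total[item] * hits[item] for item in total}
    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores for three candidate photos (higher is better).
textual = {"p1": 0.9, "p2": 0.4, "p3": 0.1}
geographic = {"p1": 0.2, "p2": 0.8}
temporal = {"p1": 0.5, "p2": 0.5, "p3": 0.7}

ranking = comb_mnz([textual, geographic, temporal])
print(ranking[0][0])  # best photo under these made-up scores
```

With min-max normalization, the lowest-scored item in each list is mapped to zero and so does not count as a hit for that ranker. A different normalization would change this behavior and possibly the final order.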
Similar resources
Geographically-aware Cross-media Retrieval for Associating Photos to Travelogues
Textual documents published on the Web where people describe traveling experiences, usually called travelogues, can provide interesting information about the experiences lived by the respective authors while traveling. Nowadays, several websites can be used for sharing these textual documents, and the use of Web information for travel planning has also increased. Still, the usage of the travelo...
An approach to graph-based analysis of textual documents
In this paper a new graph-based model is proposed for the representation of textual documents. Graph structures are obtained from textual documents by making use of the well-known Part-Of-Speech (POS) tagging technique. More specifically, a simple rule-based (re)classifier is used to map each tag onto graph vertices and edges. As a result, a decomposition of textual documents is obtained where t...
Manifold Learning for Rank Aggregation
We address the task of fusing ranked lists of documents that are retrieved in response to a query. Past work on this task of rank aggregation often assumes that documents in the lists being fused are independent and that only the documents that are ranked high in many lists are likely to be relevant to a given topic. We propose manifold learning aggregation approaches, ManX and v-ManX, that bui...
Temporal-Textual Retrieval: Time and Keyword Search in Web Documents
As the web ages, many web documents become relevant only to certain time periods, such as web-pages containing news and events or those documenting natural phenomena. Hence, to retrieve the most relevant pages, in addition to providing the relevant keywords, one may desire to identify the relevant time period(s) as well, e.g., “Barack Obama 1980-1985”. Unfortunately, not much work has been done...
Hybrid Indexing and Seamless Ranking of Spatial and Textual Features of Web Documents
There is significant commercial and research interest in location-based web search engines. Given a number of search keywords and one or more locations that a user is interested in, a location-based web search retrieves and ranks the most textually and spatially relevant web pages. In this type of search, both the spatial and textual information should be indexed. Currently, no efficient index...